Lab book for analyses using hierarchical computational modelling to identify parameters that define the best model of learning as it applies to fear conditioning acquisition and extinction, using FLARe fear conditioning data. The long abstract, justification and analysis plan can be found in the preliminary manuscript here
In short:
Identify model of learning based on a priori hypotheses that best fits the trajectories of fear relevant learning in our FLARe dataset
Cross validate best fitting model in TEDS data
Are these parameters associated with other measures of individual differences in our datasets?
Evidence from both human (Richter et al., 2012) and rodent (Galatzer-Levy, Bonanno, Bush, & LeDoux, 2013) studies suggests that trajectories of how we learn and extinguish fear differ between individuals. Different trajectories of fear and extinction have also been found using fear conditioning studies (e.g. Duits et al., 2016), a good model for the learning of, and treatment for, fear and anxiety disorders. These trajectories of fear extinction may also predict outcomes in exposure-based cognitive behavioural therapy (Kindt, 2014).
Identifying parameters that predict individual trajectories of fear learning and extinction will enable us to harness fear conditioning data more effectively to aid in understanding mechanisms underlying the development of and treatment for anxiety disorders. With more accurate models of these processes, the potential to use fear conditioning paradigms to predict those most at risk of developing an anxiety disorder, and those who might respond best to exposure-based treatments, greatly improves.
Sutton and Barto Reinforcement Learning - Textbook on reinforcement learning
Anxiety promotes memory for mood-congruent faces but does not alter loss aversion (Charpentier…Robinson, 2015) - Good example of a sensitivity learning parameter
Hypotheses About the Relationship of Cognition With Psychopathology Should be Tested by Embedding Them Into Empirical Priors (Moutoussis et al., 2018) - Including variables of interest (e.g. anxiety) in the model
Toby Wise has just submitted an aversive learning paper incorporating beta probability distributions in the best model for uncertain learning parameters etc.
A copy of this is here
Define set of a priori models moving from simple to more complex
Run each model and compare fit in FLARe pre TEDS data
Select best fitting model
Extract individual data for learning parameters from this model and see what factors best predict it
Run all models again in FLARe TEDS
Will use a combination of R (Version 3.5.1), RStan (Version 2.18.2, GitRev: 2e1f913d3ca3) and the hBayesDM package (Ahn, W.-Y., Haines, N., & Zhang, L. (2017). Revealing neuro-computational mechanisms of reinforcement learning and decision-making with the hBayesDM package. Computational Psychiatry, 1, 24-57), which itself uses RStan.
Discussion with Vince Valton and Alex Pike about the best way to fit this model. As the observed outcomes (expectancy ratings) are non-binary and are related to each other (i.e. as you become more likely to select 9, you become less likely to select 1), we should consider each trial, for each person, for each stimulus, as a constantly updating beta distribution. So you might see a pattern like this for the CS+ in acquisition, for example.
So, best model is likely to be one using beta distributions that show the probability distribution for each rating.
We can use sufficient parameters to describe these (i.e. mean / sd or possibly the mode)
A useful intuition of the beta distribution can be found here
and a useful website here
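As a quick, hypothetical sanity check of that intuition, a few lines of base R show how the two shape parameters move the density around (the shape values below are purely illustrative):

```r
## Illustrative only: evaluate the beta density over the 0-1 rating scale for a
## few shape-parameter pairs. More weight on shape1 relative to shape2 shifts
## mass towards 1; a larger shape1 + shape2 total makes the density more peaked.
x <- seq(0.01, 0.99, by = 0.01)

d_flat    <- dbeta(x, shape1 = 1,  shape2 = 1)   # uniform: no learning yet
d_high    <- dbeta(x, shape1 = 8,  shape2 = 2)   # mass near 1: high expectancy
d_certain <- dbeta(x, shape1 = 40, shape2 = 10)  # same mean, more certainty

## the mean of a beta(a, b) distribution is a / (a + b)
print(8 / (8 + 2))     # 0.8
print(40 / (40 + 10))  # also 0.8, but with a smaller variance
```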
scaling
We can scale the beta by how aversive participants find the shock, i.e. it might update their learning as if there were 0.5 of a shock or 1.5 shocks, depending on their own sensitivity to the aversiveness / punishment.
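A minimal sketch of that idea in update terms; the sensitivity parameter s and the delta-rule form below are placeholders rather than the final model:

```r
## Hypothetical sketch: a sensitivity parameter s rescales the outcome before
## the expectancy update, so the same shock updates learning as if it were,
## say, 0.5 of a shock or 1.5 shocks.
update_expectancy <- function(v, outcome, alpha, s) {
  v + alpha * (s * outcome - v)  # prediction-error update on a scaled outcome
}

v <- 0.5  # current expectancy
update_expectancy(v, outcome = 1, alpha = 0.3, s = 0.5)  # 0.5: shock discounted
update_expectancy(v, outcome = 1, alpha = 0.3, s = 1.5)  # 0.8: shock amplified
```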
alpha
generalisation
We can do this with a single beta distribution for each phase (collapsing over the two stimuli). This would be akin to a per-phase generalisation parameter, in that it will be smaller if they tend to choose the same expectancy for both stimuli and larger if they tend to choose very differently for the two stimuli.
However, these stimuli are not really equivalent (i.e. the reinforcement rate is different for the two, and we use this in the model), so collapsing over them is problematic.
So instead we can create a parameter which is the value of the CS- weighted by some value of the CS+. How much each individual weights by the CS+ can be freely estimated by the model and can serve as the generalisation parameter.
So this would be vminus = vminus + (w)vplus (where w is the freely estimated parameter per person)
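A rough sketch of that weighting in R, where w stands in for the freely estimated per-person parameter:

```r
## Hypothetical generalisation sketch: the CS- value is nudged by a weighted
## copy of the CS+ value. w = 0 means no generalisation; larger w means more
## of the CS+ expectancy 'leaks' into the CS-.
generalise <- function(vminus, vplus, w) {
  vminus + w * vplus
}

generalise(vminus = 0.2, vplus = 0.8, w = 0)     # 0.2: no generalisation
generalise(vminus = 0.2, vplus = 0.8, w = 0.25)  # 0.4: partial generalisation
```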
per stimulus
We probably want to model the CS+ and CS- separately too - so have a beta distribution characterised by sufficient parameters for each.
per trial
All of the above can then also be done with updating per trial.
leaky beta
we also need a model that incorporates ‘leak’, i.e. learning leak: participants will likely update more based on the most recent trials and learn less from more distant trials as time progresses. See Toby’s paper for more.
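One common way to express such a leak is a decay term that pulls the accumulated value back towards baseline before each update; the rule and the lambda parameter below are placeholders until we settle on the formulation in Toby's paper:

```r
## Hypothetical 'leaky' update: the old value first decays towards a baseline,
## so distant trials contribute less to the current expectancy than recent ones.
leaky_update <- function(v, outcome, alpha, lambda, v0 = 0.5) {
  v <- lambda * v + (1 - lambda) * v0  # leak: drift back towards baseline v0
  v + alpha * (outcome - v)            # then the usual prediction-error update
}

v <- 0.9
leaky_update(v, outcome = 1, alpha = 0.2, lambda = 1)    # 0.92: no leak
leaky_update(v, outcome = 1, alpha = 0.2, lambda = 0.8)  # leaked towards 0.5 first
```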
uncertainty
We should consider incorporating a parameter that maps to participant uncertainty about outcomes.
anxiety
Might be worth incorporating this as a model parameter / feature. Read this for more.
As we are using a beta distribution, we will calculate the log likelihood based on the probability density function of the distribution (i.e. where the peak of the shape will be) given the participant’s response at each trial. So we will add up the probability densities given each trial response, trial by trial, for the CS+ and CS- summed together.
We will obtain one log likelihood per trial and add these together into a single value per person, to make sure that models are comparable.
the basic stan terminology for this is below:
beta_lpdf(rating[t,p]|shape1[t,p],shape2[t,p])
where beta_lpdf is the log probability density given the rating made and each of the two beta distribution shape parameters that we estimate.
This is what we will use to compare models.
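The same quantity can be checked from R: dbeta(..., log = TRUE) is the direct analogue of Stan's beta_lpdf, and the per-trial log densities simply sum. The ratings and shape values below are illustrative, not estimates:

```r
## R analogue of the Stan log likelihood: sum the log beta density of each
## observed (rescaled) rating given that trial's shape parameters.
ratings <- c(0.72, 0.83, 0.83, 0.94)  # example rescaled expectancy ratings
shape1  <- c(2, 4, 6, 8)              # illustrative per-trial shape parameters
shape2  <- c(2, 2, 2, 2)

loglik_trial <- dbeta(ratings, shape1, shape2, log = TRUE)
loglik_total <- sum(loglik_trial)     # one log likelihood per person
print(loglik_total)
```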
V == ‘value’. Basically a parameter that captures the salience of the stimulus at any given point.
alpha == ‘learning rate’. A parameter that describes how sensitive people are when updating their learning. A fast learning rate means that learning on any given trial is weighted more towards the immediately preceding trials than towards older ones, and a slow learning rate means that all past events influence learning more evenly. Alex’s tennis analogy is good here (Federer - stable player, can predict a win based on all matches; Murray - volatile player, his last match is the best predictor of his next match performance).
beta == ‘confidence’. This is sort of an error term - how much variance in rating choice there is for each person/trial. It can be thought of as the variance, or beta^2 as the sd.
This can be confusing, as we are also using beta distributions (a different thing), which have two sufficient parameters (a + b).
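The tennis intuition can be made concrete with a generic delta-rule simulation (this is the textbook Rescorla-Wagner-style update, not our final model):

```r
## Generic delta-rule sketch: a high alpha makes the expectancy track the most
## recent outcomes; a low alpha makes it reflect the longer-run average.
simulate_learning <- function(outcomes, alpha, v0 = 0.5) {
  v <- numeric(length(outcomes))
  for (t in seq_along(outcomes)) {
    v0 <- v0 + alpha * (outcomes[t] - v0)  # update towards the current outcome
    v[t] <- v0
  }
  v
}

outcomes <- c(1, 1, 1, 0, 1, 1)  # mostly reinforced, one omission
round(simulate_learning(outcomes, alpha = 0.9), 2)  # 'Murray': swings trial to trial
round(simulate_learning(outcomes, alpha = 0.1), 2)  # 'Federer': changes slowly
```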
A really nice summary visualisation shows what these distributions look like and how they change depending on whether you alter the beta or alpha parameters.
Here are some simulations I can change and play with to illustrate the same sort of thing.
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [1] 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65
## [15] 0.70 0.75 0.80 0.85 0.90 0.95 1.00
## [1] "stable beta, increasing alpha"
## [1] "stable alpha, increasing beta"
Will probably do all per trial. Will do an early sensitivity check to confirm this.
Alpha Learning rate parameter. If high, learning will be heavily influenced by the most recent trial events; if low, it will be influenced more evenly by accumulating events.
Betas Variance/certainty parameter
These use Alex Pike’s RStan script, with minor modification to make it punishment-only, to see if it runs - testing that the approach works with the current data set-up etc.
The settings for the script are below, including Stan chain parameters and directory set-up.
## Warning: package 'tibble' was built under R version 3.5.2
## ── Attaching packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0 ✔ purrr 0.2.5
## ✔ tidyr 0.8.2 ✔ dplyr 0.8.0.1
## ✔ readr 1.3.1 ✔ stringr 1.4.0
## ✔ ggplot2 3.1.0 ✔ forcats 0.3.0
## Warning: package 'dplyr' was built under R version 3.5.2
## Warning: package 'stringr' was built under R version 3.5.2
## ── Conflicts ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ ggplot2::%+%() masks psych::%+%()
## ✖ ggplot2::alpha() masks psych::alpha()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
This loads the libraries and source files needed to run this script, and sets up RStan
## Warning: package 'StanHeaders' was built under R version 3.5.2
## Warning: package 'data.table' was built under R version 3.5.2
The below is a function that will format your numbers to one decimal place using sprintf.
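Based on that description (and the odp() call further down the notebook), the helper presumably looks something like this; the name odp and the numeric return type are assumptions taken from its later usage:

```r
## Sketch of the one-decimal-place formatter: sprintf does the rounding, and we
## return a numeric so the result can sit in a numeric column.
odp <- function(x) {
  as.numeric(sprintf("%.1f", x))
}

odp(100.2561)        # 100.3
odp(c(1.04, 7.777))  # works elementwise over a vector
```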
Only doing this ‘accurately’ for the acquisition CS+, as the simulations require a probability; I am using the contingency for this (0.75). If set to 0 for all other phases and stimuli, it looks as if the learning should be flat regardless of alpha. We expect in reality that this probability will vary between people and will be unlikely to be zero, so we also test 12 and 18 trials with probabilities of 0.5 and 0.2.
## [1] "Simulated learning rates. 12 trials; probability = 0.75 (CSp acq contingency) \n"
##### 12 trials; probability = 0.5
## [1] "Simulated learning rates. 12 trials; Probability = 0.5\n"
## [1] "Simulated learning rates. 12 trials; Probability = 0.2\n"
## [1] "Simulated learning rates. 18 trials; Probability = 0.5\n"
## [1] "Simulated learning rates. 18 trials; Probability = 0.2\n"
##
## Attaching package: 'reshape2'
## The following objects are masked from 'package:data.table':
##
## dcast, melt
## The following objects are masked from 'package:reshape':
##
## colsplit, melt, recast
## The following object is masked from 'package:tidyr':
##
## smiths
See if the basic punishment only learning model for the CS+ and CS- works with the FLARe master data
From the rstan github
This is to check that all is compiling and working and to give an idea of data format etc.
Load in the week 1 app and lab data for the FLARe pilot, TRT and headphones studies. Make it long form.
Try with the acquisition data first. This is formatted with no column names and no missing data.
Derive the n parameter for both files and check these match
stanname='punish_only.stan'
minus_name <- 'bayes_acq_minus.csv'
plus_name <- "bayes_acq_plus.csv"
stanfile <- file.path(scriptdir, stanname)
minusfile <- file.path(datadir,minus_name)
plusfile <- file.path(datadir,plus_name)
minus <- fread(minusfile,data.table=F)
plus <-fread(plusfile,data.table=F)
nacqm <- dim(minus)[1]
nacqp <- dim(plus)[1]
## check that these match and create nsub variable for RStan
if (nacqm == nacqp) {
print('subject number match')
nsub <- nacqm
print(paste('nsub set to',nsub,sep=" "))
} else {
print('WARNING: subject number does not match. Check master dataset')
}
## [1] "subject number match"
## [1] "nsub set to 342"
The expectancy rating datasets look like they are formatted fine and ntrials and nsub variables should exist.
Need to go back to stage zero and keep scream yes/no as a variable. For now, to see if this runs, create a simulated version for the CS+. The CS- will remain the same.
screamMinus <- matrix(0L,nrow=nsub, ncol=ntrials)
# Initialise plus dataset in the same way, but make the first trial 1 for everyone, then add 8 additional random 1's per person. Do this in four random patterns to mimic the real data
sc1 <- c(1,1,0,1,0,0,1,1,1,1,1)
sc2 <- c(0,1,1,1,0,0,1,1,1,1,1)
sc3 <- c(1,1,1,0,1,0,1,0,1,1,1)
sc4 <- c(1,0,1,1,0,0,1,1,1,1,1)
screamPlus <- matrix(0L,nrow=nsub, ncol=ntrials)
screamPlus[,1] <- 1
# for (n in 1:dim(screamPlus)[1]) {
# print(n)
# screamPlus[n,2:12] <- sample(patts,1,replace=T)
# }
for (n in 1:dim(screamPlus)[1]) {
a <- sample(1:4,1)
if (a == 1) {
screamPlus[n,2:12] <- sc1
} else if (a == 2) {
screamPlus[n,2:12] <- sc2
} else if (a == 3){
screamPlus[n,2:12] <- sc3
} else {
screamPlus[n,2:12] <- sc4
}
}
For now, to see if Stan runs using the bernoulli-logit function, make binary responses from the expectancy ratings, i.e. >= 4.5 == 1, < 4.5 == 0.
This directs to my local machine here /Users/kirstin/Dropbox/SGDP/FLARe/FLARe_MASTER/Projects/Hierachal_modelling/Scripts and is remotely linked to the github repository here.
## Already up to date.
Unhash this if you want to check what the model looks like within the notebook.
use echo to push these to the new file if you want to make changes from here.
Unhash this to run the experimental script that checked whether Stan runs. This was mostly to check data formatting and installation / compilation etc.
flare_data<-list(ntrials=ntrials,nsub=nsub,includeTrial = rep(1,ntrials), screamPlus=t(screamPlus),screamMinus=t(screamMinus),
ratingPlus=t(plusb),ratingMinus=t(minusb))
#flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
#save(flare_fit, file=file.path(datadir,'flare_fit_test'))
#traceplot(flare_fit,'lp__')
# extract fit data
#summary_flare<- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
view the fit information
extract the loglikelihood using loo
So, the good news is this all works - the preliminary check was a success. Next we need to consider the appropriate model.
We need to rescale our dataset here to be between 0 and 1.
Importantly, because we are using the proportion of trials that are not reinforced as a known parameter for statistical reasons (we don’t want proportions of .75 and 1; better to have .25 and 0), we have made our rescaled expectancy values 1 - rescaled(x). This means that we will still be able to interpret the results in the expected way (i.e. a higher rating is a greater expectation of the outcome).
rescale the 1-9 expectancy values to be on a 0-1 scale.
Stan cannot deal with the extreme limits of the beta distribution, so make the rescaled limits just above 0 and just below 1.
Note that when a value had to be imputed as it was missing it will not be an integer. Thus the function needs to allow for ranges between values.
##
## Attaching package: 'scales'
## The following object is masked from 'package:purrr':
##
## discard
## The following object is masked from 'package:readr':
##
## col_factor
## The following objects are masked from 'package:psych':
##
## alpha, rescale
# rescale and flip so that we are effectively rating the expectation that they WILL NOT hear a scream to match stan
## rescaling such that the distribution spaces the numbers 1-9 evenly. the first interval upper bound would be 0.11, then 0.22 etc. this means that the mid point of each interval will be:
print("mid point of each evenly spaced interval representing values between 1-9")
## [1] "mid point of each evenly spaced interval representing values between 1-9"
## [1] 0.05555556 0.16666667 0.27777778 0.38888889 0.50000000 0.61111111
## [7] 0.72222222 0.83333333 0.94444444
## thus 1 will be 1-0.055 etc.
## NOTE: might want to consider making this more flexible. enter the number of choice options as a variable - would be very easy. add to function library at a later stage
scale_flare <- function(x){
vals <- seq(0.5/9,1,1/9)
for (val in 1:9){
if (x > val-1 & x <= val){
x <- 1 - vals[val]
}
}
return(x)
}
## initialise minus_scaled dataframe.
minus_scaled <- data.frame(matrix(ncol=dim(minus)[2],nrow = dim(minus)[1]))
## populate with rescaled values
for (sub in 1:dim(minus)[1]){
for (col in 1:dim(minus)[2]){
minus_scaled[sub,col] <- scale_flare(minus[sub,col])
}
}
## ditto for plus
plus_scaled <- data.frame(matrix(ncol=dim(plus)[2],nrow = dim(plus)[1]))
for (sub in 1:dim(plus)[1]){
for (col in 1:dim(plus)[2]){
plus_scaled[sub,col] <- scale_flare(plus[sub,col])
}
}
## this is the number that will take from the midpoint to the top and bottom for the new boundaries (with ratings representing the midpoint)
cdf_scale <- 1/18
This is a vector containing the absolute number of trials where no scream occurred for each stimulus. As there was a 75% reinforcement rate for the CS+ (9/12 trials), this is a vector of 3s. For the CS-, no trials were reinforced, so it is a vector of 12s.
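For reference, those count vectors can be built directly; the names No_scream_p / No_scream_m match the Stan data list used below, and nsub = 342 comes from the subject-number check above (a sketch, since the real derivation happens upstream):

```r
## Sketch: absolute number of non-reinforced trials per stimulus, per subject.
nsub <- 342                    # from the subject-number check above

No_scream_p <- rep(3L,  nsub)  # CS+: 12 trials - 9 reinforced (75% rate) = 3
No_scream_m <- rep(12L, nsub)  # CS-: never reinforced, so all 12 trials
```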
Create datasets for the acquisition CS- and extinction CS+ and CS- reflecting that no screams occurred at all. Then use the pattern id variable to create a dataset for the acquisition CS+ indicating when a scream occurred for each participant.
## Create the no-scream datasets for all
screamMinus <- matrix(0L,nrow=nsub, ncol=ntrials)
# Initialise plus dataset in the same way, but make the first trial 1 for everyone, then add 8 additional random 1's per person. Do this in four random patterns to mimic the real data
sc1 <- c(1,1,0,1,0,0,1,1,1,1,1)
sc2 <- c(0,1,1,1,0,0,1,1,1,1,1)
sc3 <- c(1,1,1,0,1,0,1,0,1,1,1)
sc4 <- c(1,0,1,1,0,0,1,1,1,1,1)
screamPlus <- matrix(0L,nrow=nsub, ncol=ntrials)
screamPlus[,1] <- 1
# for (n in 1:dim(screamPlus)[1]) {
# print(n)
# screamPlus[n,2:12] <- sample(patts,1,replace=T)
# }
for (n in 1:dim(screamPlus)[1]) {
a <- sample(1:4,1)
if (a == 1) {
screamPlus[n,2:12] <- sc1
} else if (a == 2) {
screamPlus[n,2:12] <- sc2
} else if (a == 3){
screamPlus[n,2:12] <- sc3
} else {
screamPlus[n,2:12] <- sc4
}
}
Because we use the 1-rescaled expectancy data, there is no need to try and invert to reinforcement parameters here. As a result we need the stan model to simply be:
alphaPlus[p] = nothingPlus[p]/ntrials;
alphaMinus[p] = nothingMinus[p]/ntrials;
Here we try to estimate the alpha parameter of the beta distribution per trial, per person, per stimulus (i.e. you have two sufficient parameters for each beta distribution, the alpha and the beta; we want to estimate the alpha).
Eventually we will scale these by the actual ‘value’ of the scream for each person per trial.
Using data loaded in from preliminary tests above.
So this is a beta value per person (assuming the underlying process for the plus and minus is the same).
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_noscaling.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,nothingPlus = No_scream_p, nothingMinus=No_scream_m,ratingsPlus=plus_scaled,ratingsMinus=minus_scaled)
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
##
## SAMPLING FOR MODEL 'beta_noscaling' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.002004 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 20.04 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 120 / 400 [ 30%] (Warmup)
## Chain 1: Iteration: 160 / 400 [ 40%] (Warmup)
## Chain 1: Iteration: 200 / 400 [ 50%] (Warmup)
## Chain 1: Iteration: 201 / 400 [ 50%] (Sampling)
## Chain 1: Iteration: 240 / 400 [ 60%] (Sampling)
## Chain 1: Iteration: 280 / 400 [ 70%] (Sampling)
## Chain 1: Iteration: 320 / 400 [ 80%] (Sampling)
## Chain 1: Iteration: 360 / 400 [ 90%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 13.2378 seconds (Warm-up)
## Chain 1: 3.82097 seconds (Sampling)
## Chain 1: 17.0588 seconds (Total)
## Chain 1:
## Warning in sqrt(ess): NaNs produced
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## [1] "400 iterations on 1 chains. "
## [1] "Estimated 1 Free paramaters per person"
A simple alteration of the first model: we estimate a scaling parameter per person over all trials and apply this to the alpha component per participant.
Here we try to estimate the alpha parameter of the beta distribution per trial, per person, per stimulus (i.e. you have two sufficient parameters for each beta distribution, the alpha and the beta; we want to estimate the alpha).
Eventually we will scale these by the actual ‘value’ of the scream for each person per trial.
Using data loaded in from preliminary tests above.
So this is a beta value per person (assuming the underlying process for the plus and minus is the same).
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_scaling.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,nothingPlus = No_scream_p, nothingMinus=No_scream_m,ratingsPlus=plus_scaled,ratingsMinus=minus_scaled)
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
##
## SAMPLING FOR MODEL 'beta_scaling' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.002054 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 20.54 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 120 / 400 [ 30%] (Warmup)
## Chain 1: Iteration: 160 / 400 [ 40%] (Warmup)
## Chain 1: Iteration: 200 / 400 [ 50%] (Warmup)
## Chain 1: Iteration: 201 / 400 [ 50%] (Sampling)
## Chain 1: Iteration: 240 / 400 [ 60%] (Sampling)
## Chain 1: Iteration: 280 / 400 [ 70%] (Sampling)
## Chain 1: Iteration: 320 / 400 [ 80%] (Sampling)
## Chain 1: Iteration: 360 / 400 [ 90%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 26.3089 seconds (Warm-up)
## Chain 1: 55.3722 seconds (Sampling)
## Chain 1: 81.6811 seconds (Total)
## Chain 1:
## Warning: There were 1 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
## http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
## Warning: Examine the pairs() plot to diagnose sampling problems
# extract fit data
summary_flare<- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## [1] "400 iterations on 1 chains. "
## [1] "Estimated 2 Free paramaters per person"
Here we try to estimate the alpha parameter of the beta distribution per trial, per person, per stimulus (i.e. you have two sufficient parameters for each beta distribution, the alpha and the beta; we want to estimate the alpha).
Eventually we will scale these by the actual ‘value’ of the scream for each person per trial.
Using data loaded in from preliminary tests above.
So this is a beta value per person (assuming the underlying process for the plus and minus is the same).
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_withRL.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
##
## SAMPLING FOR MODEL 'beta_withRL' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.00454 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 45.4 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 120 / 400 [ 30%] (Warmup)
## Chain 1: Iteration: 160 / 400 [ 40%] (Warmup)
## Chain 1: Iteration: 200 / 400 [ 50%] (Warmup)
## Chain 1: Iteration: 201 / 400 [ 50%] (Sampling)
## Chain 1: Iteration: 240 / 400 [ 60%] (Sampling)
## Chain 1: Iteration: 280 / 400 [ 70%] (Sampling)
## Chain 1: Iteration: 320 / 400 [ 80%] (Sampling)
## Chain 1: Iteration: 360 / 400 [ 90%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 38.9849 seconds (Warm-up)
## Chain 1: 56.5238 seconds (Sampling)
## Chain 1: 95.5087 seconds (Total)
## Chain 1:
# extract fit data
summary_flare <- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## [1] "400 iterations on 1 chains. "
## [1] "Estimated 2 Free paramaters per person"
This model includes an alpha learning parameter per person, estimating their learning rate and updating based on it. This model needs a dataset that indicates whether a scream occurred on each trial, instead of the proportion of times no scream occurred.
Alex used this stack post to help solve the shape parameters using the mean and sd, where we assume that V serves as the mean and beta as the sd.
the equations work out to this:
for shape 1:
\[\alpha = \left(\frac{1-\mu}{\sigma^2} - \frac{1}{\mu}\right)\mu^2\]
for shape 2:
\[\beta=\alpha \left(\frac{1}{\mu}-1\right)\]
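As a numerical check of these two equations (with σ² the variance), converting a mean and variance to shape parameters and then recovering the moments:

```r
## Method-of-moments conversion from mean/variance to beta shape parameters,
## following the two equations above.
beta_shapes <- function(mu, var) {
  a <- ((1 - mu) / var - 1 / mu) * mu^2  # shape 1
  b <- a * (1 / mu - 1)                  # shape 2
  c(shape1 = a, shape2 = b)
}

s <- beta_shapes(mu = 0.5, var = 0.04)
print(s)  # shape1 = 2.625, shape2 = 2.625

## recover the moments to confirm the algebra
a <- s[["shape1"]]; b <- s[["shape2"]]
a / (a + b)                          # 0.5  (the mean)
(a * b) / ((a + b)^2 * (a + b + 1))  # 0.04 (the variance)
```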
Hashed this out as it doesn’t run!
# ## decide testing rate (min,med,max or off)
# testing('min')
#
# ## set up run
# stanname='beta_meansd_RL.stan'
#
# stanfile <- file.path(scriptdir, stanname)
#
# flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
#
# flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
#
# save(flare_fit, file=file.path(datadir,'flare_fit_test'))
#
# traceplot(flare_fit,'lp__')
#
# # extract fit data
# summary_flare<- summary(flare_fit)
#
# # extract model summary data
#
# #flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
On 500 iterations (i.e. a test run) the variance in alpha is good, but the traceplot is terrible; the model converges very poorly. We also have to constrain the beta to be between 0 and 0.0001. Not sure why this is.
when running for 2000 iterations (1000 warmup)…
This results in the following warnings:
There were 2644 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
There were 4 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
There were 4 chains where the estimated Bayesian Fraction of Missing Information was low. See http://mc-stan.org/misc/warnings.html#bfmi-low
Examine the pairs() plot to diagnose sampling problems
The above mean definition does not map the data well (terrible traceplot!). I found this from the MRC BSU and have tried defining the beta parameters, assuming V == mean, in a slightly different way:
for parameter a:
\[\alpha = \mu\beta/(1-\mu)\]
for parameter b:
\[\beta = \mu(1-\mu)^2/\sigma+\mu-1\]
Still using a single beta here.
Hashed this out as it doesn’t run (saves time).
# ## decide testing rate (min,med,max or off)
# testing('min')
#
# ## set up run
# stanname='beta_meansd_RL_2.stan'
#
# stanfile <- file.path(scriptdir, stanname)
#
# flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
#
# flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
#
# save(flare_fit, file=file.path(datadir,'flare_fit_test'))
#
# traceplot(flare_fit,'lp__')
#
# # extract fit data
# summary_flare<- summary(flare_fit)
#
# # extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
Noted that the shape parameters have slight variations in definition according to the discussion here. Updated the script slightly to reflect this, based on the reply from ocram.
the first term in shape 1 is changed so that σ denotes the variance directly (rather than the sd, squared), so it changes from:
\[\alpha = \left(\frac{1-\mu}{\sigma^2} - \frac{1}{\mu}\right)\mu^2\]
to
\[\alpha = \left(\frac{1-\mu}{\sigma} - \frac{1}{\mu}\right)\mu^2\]
This changes the shape 2 parameter definition from:
\[\beta=\alpha \left(\frac{1}{\mu}-1\right)\]
to
\[\beta = \left(\frac{1-\mu}{\sigma} - \frac{1}{\mu}\right)\mu\left(1-\mu\right)\]
Because this works best, we will add the log likelihood calculation here, basing it on the probability density function of the beta distribution given the participant’s actual ratings and the sufficient parameters of the distribution per trial.
loglik[p] = loglik[p] + beta_lpdf(ratingsPlus[t,p]|shape1_Plus[t,p],shape2_Plus[t,p]) + beta_lpdf(ratingsMinus[t,p]|shape1_Minus[t,p],shape2_Minus[t,p])
Hashed this out as it doesn’t run (saves time).
## decide testing rate (min,med,max or off)
# testing('min')
#
# ## set up run
# stanname='beta_meansd_RL_3.stan'
#
# stanfile <- file.path(scriptdir, stanname)
#
# flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
#
# flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
#
# save(flare_fit, file=file.path(datadir,'flare_fit_test'))
#
# traceplot(flare_fit,'lp__')
#
# # extract fit data
# summary_flare<- summary(flare_fit)
#
# # extract model summary data
#
# flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
This model is substantially better than either of the other two. The traceplot suggests that the iterations converge as we would like. However, we still need to massively constrain the beta (i.e. confidence / uncertainty) estimates for it to run, otherwise the starting values drop below zero.
## extract log likelihood
#
# flare_loglike <- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
#
# #calculate BIC
#
# FLARe_bic<-bic(ntrials,-colMeans(flare_loglike),2) #number of parameters in that model e.g. 4)
#
# ## mean BIC as model comparisons tool:
#
# print("Mean Bayesian information criterion for model")
# mean(FLARe_bic)
Here I try to define the parameters using simplified mean and precision estimates, as per this tutorial. See in particular the parameter estimation on the cubs data.
This results in a relatively simplified parameter estimation compared to model 3.
\[\alpha = \mu * ((\mu * (1-\mu)) / \sigma - 1)\]
where mu is the mean (or value) and sigma is the variance / uncertainty parameter we currently call beta.
and the b (or shape 2) parameter for the distribution is:
\[\beta = (1- \mu) * ((\mu * (1-\mu)) / \sigma - 1)\]
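Numerically, this simplified form agrees with the earlier method-of-moments expressions (again treating sigma as the variance), which is a useful check that only the algebraic arrangement has changed:

```r
## The simplified mean/precision form used in model 4, alongside the earlier
## method-of-moments form, to confirm they define the same mapping.
shapes_simple <- function(mu, sigma) {
  nu <- (mu * (1 - mu)) / sigma - 1       # 'precision' term
  c(a = mu * nu, b = (1 - mu) * nu)
}
shapes_mom <- function(mu, sigma) {
  a <- ((1 - mu) / sigma - 1 / mu) * mu^2
  c(a = a, b = a * (1 / mu - 1))
}

shapes_simple(mu = 0.7, sigma = 0.02)  # a = 6.65, b = 2.85
shapes_mom(mu = 0.7, sigma = 0.02)     # identical values
```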
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_meansd_RL_4.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled),cdf_scale=cdf_scale)
flare_fit_best <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?##
## SAMPLING FOR MODEL 'beta_meansd_RL_4' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.010888 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 108.88 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 120 / 400 [ 30%] (Warmup)
## Chain 1: Iteration: 160 / 400 [ 40%] (Warmup)
## Chain 1: Iteration: 200 / 400 [ 50%] (Warmup)
## Chain 1: Iteration: 201 / 400 [ 50%] (Sampling)
## Chain 1: Iteration: 240 / 400 [ 60%] (Sampling)
## Chain 1: Iteration: 280 / 400 [ 70%] (Sampling)
## Chain 1: Iteration: 320 / 400 [ 80%] (Sampling)
## Chain 1: Iteration: 360 / 400 [ 90%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 74.4862 seconds (Warm-up)
## Chain 1: 37.5121 seconds (Sampling)
## Chain 1: 111.998 seconds (Total)
## Chain 1:
save(flare_fit_best, file=file.path(datadir,'flare_fit_simpleMean'))
traceplot(flare_fit_best,'lp__')
## [1] "400 iterations on 1 chains. "
## [1] "Estimated 98 Free paramaters per person"
## [1] "This table is very large. Returning only the top 6 entries unless you have set the 3rd function option to 'all'. "
## extract log likelihood
flare_loglike_best <- extract_log_lik(flare_fit_best, parameter_name = "loglik", merge_chains = TRUE)
#calculate BIC
FLARe_bic<-bic(ntrials,-colMeans(flare_loglike_best),2) #number of parameters in that model e.g. 4)
## mean BIC as model comparisons tool:
print("Mean Bayesian information criterion for model")
## [1] "Mean Bayesian information criterion for model"
## [1] 100.2561
mod_comp <- rbind(mod_comp,c("Means 1 beta",as.numeric(mean(FLARe_bic))))
mod_comp$BIC <- odp(as.numeric(mod_comp$BIC))
## Warning in odp(as.numeric(mod_comp$BIC)): NAs introduced by coercion
Used this post to guide this, particularly:
For a beta distribution with shape parameters a and b, the mode is (a-1)/(a+b-2). Suppose we have a desired mode, and we want to determine the corresponding shape parameters. Here’s the solution. First, we express the “certainty” of the estimate in terms of the equivalent prior sample size, k=a+b, with k≥2. The certainty must be at least 2 because it essentially assumes that the prior contains at least one “head” and one “tail,” which is to say that we know each outcome is at least possible. Then a little algebra reveals:
a = mode * (k-2) + 1
b = (1-mode) * (k-2) + 1
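In R, the quoted recipe amounts to a couple of lines (a sketch with made-up values; `beta_shapes_mode` is our own name):

```r
## convert a desired mode and concentration k (= a + b, with k >= 2)
## into beta shape parameters, as in the quoted post
beta_shapes_mode <- function(mode, k) {
  stopifnot(k >= 2, mode >= 0, mode <= 1)
  c(a = mode * (k - 2) + 1, b = (1 - mode) * (k - 2) + 1)
}

s <- beta_shapes_mode(mode = 0.8, k = 10)     # -> a = 7.4, b = 2.6
(s["a"] - 1) / (sum(s) - 2)                   # recovers the mode, 0.8
```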
For this version we try to estimate the ‘mode’ to be shape 1. KIRSTIN: explain here.
## decide testing rate (min,med,max or off)
#
# testing('skip')
#
# ## set up run
#
# stanname='beta_mode_RL.stan'
#
# stanfile <- file.path(scriptdir, stanname)
#
# flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
#
# flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
#
# save(flare_fit, file=file.path(datadir,'flare_fit_simpleMode'))
#
# traceplot(flare_fit,'lp__')
#
# # extract fit data
# summary_flare<- summary(flare_fit)
#
# # extract model summary data
#
# #flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
For this version we assume that V is the mode (above we assumed it serves as the mean) and beta is the certainty aspect (i.e. k).
What this does is basically treat the expected rating (value) as the a parameter for the distribution (scaled by their certainty, beta) and 1 minus that value as the b parameter (again, scaled by the uncertainty).
So you have a ratio of their selected value per trial (mode across iterations?) to how far from the highest possible choice they are.
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_mode_RL_2.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus=t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled),cdf_scale=cdf_scale)
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
## SAMPLING FOR MODEL 'beta_mode_RL_2' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.008131 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 81.31 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 120 / 400 [ 30%] (Warmup)
## Chain 1: Iteration: 160 / 400 [ 40%] (Warmup)
## Chain 1: Iteration: 200 / 400 [ 50%] (Warmup)
## Chain 1: Iteration: 201 / 400 [ 50%] (Sampling)
## Chain 1: Iteration: 240 / 400 [ 60%] (Sampling)
## Chain 1: Iteration: 280 / 400 [ 70%] (Sampling)
## Chain 1: Iteration: 320 / 400 [ 80%] (Sampling)
## Chain 1: Iteration: 360 / 400 [ 90%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 44.5763 seconds (Warm-up)
## Chain 1: 30.0698 seconds (Sampling)
## Chain 1: 74.646 seconds (Total)
## Chain 1:
## [1] "400 iterations on 1 chains. "
## [1] "Estimated 97 Free paramaters per person"
## [1] "This table is very large. Returning only the top 6 entries unless you have set the 3rd function option to 'all'. "
This works, but there is not a lot of variance in the alpha parameter when defined by the mode (mean 0.49, sd 0.06), compared to when defined by the mean (mean 0.54, sd 0.26).
However, there is a lot of variation in the beta parameter (mean -7.21, sd 134.74).
## extract log likelihood
flare_loglike <- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
#calculate BIC
FLARe_bic<-bic(ntrials,-colMeans(flare_loglike),2) #number of parameters in that model e.g. 4)
## mean BIC as model comparisons tool:
print("Mean Bayesian information criterion for model")
## [1] "Mean Bayesian information criterion for model"
## [1] 101.854
mod_comp <- rbind(mod_comp,c("Mode 1 beta",as.numeric(mean(FLARe_bic))))
mod_comp$BIC <- odp(as.numeric(mod_comp$BIC))
mod_comp <- as.data.frame(na.omit(mod_comp))
## plot function - create plot
plot_models(mod_comp)
## Model 5: RL mean defined, two beta {.tabset}
RL model adding a beta per stimulus to Alex’s model
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_meansd_2beta_RL.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled),cdf_scale=cdf_scale)
flare_fit_m2 <- stan(file = stanfile, data = flare_data, iter=chain_iter,warmup = warm_up, chains = chain_n) #add working dir?
## SAMPLING FOR MODEL 'beta_meansd_2beta_RL' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.010213 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 102.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: WARNING: There aren't enough warmup iterations to fit the
## Chain 1: three stages of adaptation as currently configured.
## Chain 1: Reducing each adaptation stage to 15%/75%/10% of
## Chain 1: the given number of warmup iterations:
## Chain 1: init_buffer = 15
## Chain 1: adapt_window = 75
## Chain 1: term_buffer = 10
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 101 / 400 [ 25%] (Sampling)
## Chain 1: Iteration: 140 / 400 [ 35%] (Sampling)
## Chain 1: Iteration: 180 / 400 [ 45%] (Sampling)
## Chain 1: Iteration: 220 / 400 [ 55%] (Sampling)
## Chain 1: Iteration: 260 / 400 [ 65%] (Sampling)
## Chain 1: Iteration: 300 / 400 [ 75%] (Sampling)
## Chain 1: Iteration: 340 / 400 [ 85%] (Sampling)
## Chain 1: Iteration: 380 / 400 [ 95%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 46.8227 seconds (Warm-up)
## Chain 1: 106.611 seconds (Sampling)
## Chain 1: 153.433 seconds (Total)
## Chain 1:
## [1] "400 iterations on 1 chains. "
## [1] "Estimated 99 Free paramaters per person"
## [1] "This table is very large. Returning only the top 6 entries unless you have set the 3rd function option to 'all'. "
## extract log likelihood
flare_loglike_m2 <- extract_log_lik(flare_fit_m2, parameter_name = "loglik", merge_chains = TRUE)
#calculate BIC
FLARe_bic_m2 <- bic(ntrials,-colMeans(flare_loglike_m2),3) #number of parameters in that model e.g. 4)
# mean for all participants
mean(FLARe_bic_m2)
## [1] 101.5727
RL model adding a beta per stimulus to the model defining the beta shape using the mode instead of the mean. This definitely makes more sense, as we assume that participants will have different levels of uncertainty about each stimulus.
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_mode_2beta_RL_2.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
## SAMPLING FOR MODEL 'beta_mode_2beta_RL_2' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.006392 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 63.92 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 120 / 400 [ 30%] (Warmup)
## Chain 1: Iteration: 160 / 400 [ 40%] (Warmup)
## Chain 1: Iteration: 200 / 400 [ 50%] (Warmup)
## Chain 1: Iteration: 201 / 400 [ 50%] (Sampling)
## Chain 1: Iteration: 240 / 400 [ 60%] (Sampling)
## Chain 1: Iteration: 280 / 400 [ 70%] (Sampling)
## Chain 1: Iteration: 320 / 400 [ 80%] (Sampling)
## Chain 1: Iteration: 360 / 400 [ 90%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 54.7869 seconds (Warm-up)
## Chain 1: 34.1505 seconds (Sampling)
## Chain 1: 88.9374 seconds (Total)
## Chain 1:
# extract fit data
summary_flare <- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## [1] "400 iterations on 1 chains. "
## [1] "Estimated 98 Free paramaters per person"
## [1] "This table is very large. Returning only the top 6 entries unless you have set the 3rd function option to 'all'. "
The alpha parameter variation looks reasonable (mean 0.4, sd 0.12). Beta is much more bounded now, though (combined across both stimuli: mean 0.79, sd 1.6), over 4000 iterations on 4 chains.
## extract log likelihood
flare_loglike <- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
#calculate BIC
FLARe_bic<-bic(ntrials,-colMeans(flare_loglike),2) #number of parameters in that model e.g. 4)
## mean BIC as model comparisons tool:
print("Mean Bayesian information criterion for model")
## [1] "Mean Bayesian information criterion for model"
## [1] 101.9691
The beta doesn’t work as well for the CS+ stimulus; need to check whether this parameter adds anything to the model - drop it from our best mean model and see how this changes the fit.
This takes forever to run and the log likelihood extraction fails, so no idea if it is good yet - come back to this. Hashed out for now.
#
# ## decide testing rate (min,med,max or off)
# testing('min')
# # try a weakly informative prior
# #N0w <- normal(0, 100)
#
#
# ## set up run
# stanname='beta_meansd_RL_NoBeta.stan'
#
# stanfile <- file.path(scriptdir, stanname)
#
# flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
#
# flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
#
# save(flare_fit, file=file.path(datadir,'flare_fit_test'))
#
# traceplot(flare_fit,'lp__')
#
# # extract fit data
# summary_flare <- summary(flare_fit,na.rm=T)
#
# # extract model summary data
#
# #flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
# ## extract log likelihood
#
# flare_loglike <- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
#
# #calculate BIC
#
# FLARe_bic<-bic(ntrials,-colMeans(flare_loglike),2) #number of parameters in that model e.g. 4)
#
# ## mean BIC as model comparisons tool:
#
# print("Mean Bayesian information criterion for model")
# mean(FLARe_bic)
Here I test whether the model is working well by seeing if I can use the parameters we’ve estimated to generate our existing rating data and then recover similar parameters again.
I will do this for the best fitting model (mean-defined beta distribution with a variance estimate per person for each stimulus). This is the model where we treat the iterated ratings as if they are ‘expected’ values and use this as the shape 1 parameter for our beta distribution at each trial. We have allowed a beta (or uncertainty) parameter per stimulus.
A good model will have a) a good correlation between real data and the data generated and b) a good correlation between the parameter estimates from the real and generated data.
We basically want to replicate our Stan script, but instead of estimating parameters, we assume that we know what the parameters are (i.e. use the alphas and betas we have estimated previously).
Update: it turns out the single-beta model is the best fitting model when I correct my BIC function to include the negative log likelihood. So I will also generate and recover for this model and use it as the comparator.
Use the summary of the Stan model to extract the different parameters we want to try to use to recreate our data.
rating_est_plus <- data.frame(matrix(ncol=ntrials,nrow=nsub))
rating_est_minus <- data.frame(matrix(ncol=ntrials,nrow=nsub))
# beta shape parameters
shape1p <- data.frame(matrix(ncol=ntrials,nrow=nsub))
shape1m <- data.frame(matrix(ncol=ntrials,nrow=nsub))
shape2p <- data.frame(matrix(ncol=ntrials,nrow=nsub))
shape2m <- data.frame(matrix(ncol=ntrials,nrow=nsub))
# V parameters (initialised at 0.5)
vp <- data.frame(matrix(ncol=ntrials,nrow=nsub))
vm <- data.frame(matrix(ncol=ntrials,nrow=nsub))
vp[1] <- 0.5
vm[1] <- 0.5
# prediction error
dp <- data.frame(matrix(ncol=(ntrials-1),nrow=nsub))
dm <- data.frame(matrix(ncol=(ntrials-1),nrow=nsub))
Use our extracted parameters in place of estimating them, following the Stan syntax.
Use the alpha parameters we’ve extracted (alpha_est). d == delta (prediction error); v == value (i.e. the value for each stimulus).
for (p in 1:nsub){
for (t in 1:(ntrials-1)){
dp[p,t] <- screamPlus[p,t]-vp[p,t]
dm[p,t] <- screamMinus[p,t]-vm[p,t]
vp[p,t+1] <- vp[p,t]+alpha_est[p,1]*dp[p,t]
vm[p,t+1] <- vm[p,t]+alpha_est[p,1]*dm[p,t]  # note: dm, not dp, for the CS-
}
}
for (t in 1:ntrials){
shape1_Plus[t,p] = VPlus[t,p] * ((VPlus[t,p] * (1-VPlus[t,p])) / beta[p,1]);
shape1_Minus[t,p] = VMinus[t,p] * ((VMinus[t,p] * (1-VMinus[t,p])) / beta[p,2]);
shape2_Plus[t,p] = (1-VPlus[t,p]) * ((VPlus[t,p] * (1-VPlus[t,p])) / beta[p,1]);
shape2_Minus[t,p] = (1-VMinus[t,p]) * ((VMinus[t,p] * (1-VMinus[t,p])) / beta[p,2]);
ratingsPlus[t,p] ~ beta(shape1_Plus[t,p],shape2_Plus[t,p]);
ratingsMinus[t,p] ~ beta(shape1_Minus[t,p],shape2_Minus[t,p]);
}
  }
}
Use the new v frames and beta parameters.
Shape 1 and 2 are sufficient parameters for the beta distribution.
for (p in 1:nsub){
for (t in 1:ntrials){
shape1p[p,t] = vp[p,t] * ((vp[p,t] * (1-vp[p,t])) / beta[p,1])
shape1m[p,t] = vm[p,t] * ((vm[p,t] * (1-vm[p,t])) / beta[p,1])
shape2p[p,t] = (1-vp[p,t]) * ((vp[p,t] * (1-vp[p,t])) / beta[p,1])
shape2m[p,t] = (1-vm[p,t]) * ((vm[p,t] * (1-vm[p,t])) / beta[p,1])
}
}
Trying to use pbeta here (which gives the cumulative distribution function for a set of quantiles).
For now, setting probabilities between 0 and 1 and taking the average…
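An alternative to averaging over pbeta output is to draw simulated ratings directly with rbeta, R’s random generator for the beta distribution; a minimal sketch (the commented loop shows how it would plug into the frames built above):

```r
## draw one simulated rating from a fitted pair of beta shapes
set.seed(123)                                  # reproducible draws
sim_rating <- function(shape1, shape2) rbeta(1, shape1, shape2)

## plugged into the frames above it would look like:
# for (p in 1:nsub) {
#   for (t in 1:ntrials) {
#     rating_est_plus[p, t]  <- sim_rating(shape1p[p, t], shape2p[p, t])
#     rating_est_minus[p, t] <- sim_rating(shape1m[p, t], shape2m[p, t])
#   }
# }

sim_rating(2.28, 1.52)                          # a single draw, always in (0, 1)
```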
You could argue that these should match the discrete scale nature of the original ratings. We effectively undid this in our script. The following will enable this.
HOWEVER: we are reducing variance massively this way, so I think it might be better to leave the recovered ratings unscaled…
So - the following discrete values exist in our rescaled ratings:
##
## 0.0555555555555556 0.166666666666667 0.277777777777778
## 4 3 20
## 0.388888888888889 0.5 0.611111111111111
## 19 144 26
## 0.722222222222222 0.833333333333333 0.944444444444444
## 18 18 90
Anything that falls within 0.05555556 above or below one of these values will be set to that median point. Note that this is the cdf_scale factor that we used in the script to capture the full area under the curve for each segment of the distribution represented by the discrete ratings of 1-9.
Write the function to rescale
scale_simulated <- function(x){
scaled_list <- array(unique(plus_scaled$X1))
for (val in scaled_list[1:length(scaled_list)]){
if (x > val-cdf_scale & x < val+cdf_scale){
x <- val
}
}
return(x)
}
Apply it to the simulated rating frames.
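A self-contained check of the snapping logic, with `tol` standing in for cdf_scale and the grid being the nine discrete rescaled values listed above (`snap_to_grid` is an illustrative stand-in for scale_simulated):

```r
## snap a continuous value to the nearest point of a discrete grid
snap_to_grid <- function(x, grid, tol) {
  for (val in grid) {
    if (x > val - tol & x < val + tol) x <- val
  }
  x
}

grid <- seq(1, 17, by = 2) / 18         # 0.0556, 0.1667, ..., 0.9444
snap_to_grid(0.52, grid, tol = 1/18)    # -> 0.5
snap_to_grid(0.90, grid, tol = 1/18)    # -> 0.9444...
```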
(unhash to run this)
## initialise dataframes
#
# est_plus_scaled <- data.frame(matrix(ncol=dim(rating_est_plus)[2],nrow = dim(rating_est_plus)[1]))
# est_minus_scaled <- data.frame(matrix(ncol=dim(rating_est_minus)[2],nrow = dim(rating_est_minus)[1]))
#
# ## populate with rescaled values
#
# for (sub in 1:dim(rating_est_plus)[1]){
# for (col in 1:dim(rating_est_plus)[2]){
#
# est_plus_scaled[sub,col] <- scale_simulated(rating_est_plus[sub,col])
# }
# }
#
# for (sub in 1:dim(rating_est_minus)[1]){
# for (col in 1:dim(rating_est_minus)[2]){
#
# est_minus_scaled[sub,col] <- scale_simulated(rating_est_minus[sub,col])
# }
# }
Use the simulated ratings per person that we have derived using our parameters and see how well they align with the real ratings…
Only showing the diagonals from the corr.test output here, to get the important t1 x t1 etc. values.
This will use either the rating_est files (rating_est_plus; rating_est_minus) or the est_scaled files (est_minus_scaled; est_plus_scaled), depending on whether we opt to return scaling or not.
## [1] "real ratings with estimated ratings: CS MINUS"
## X1 X2 X3 X4 X5 X6
## -0.04062651 0.26473471 0.31436224 0.24148588 0.56215013 0.34990452
## X7 X8 X9 X10 X11 X12
## 0.32450766 -0.04362274 0.30391317 0.46121675 0.37782084 0.50161460
## [1] "real ratings with estimated ratings: CS MINUS (average for all trials)"
## [1] 0.6029676
## [1] "real ratings with estimated ratings: CS PLUS"
## X1 X2 X3 X4 X5
## 0.144973857 0.211167199 0.314511846 0.121509543 0.120762502
## X6 X7 X8 X9 X10
## 0.238806813 0.057619883 -0.154140224 0.002674489 0.120125712
## X11 X12
## 0.111760194 0.265633715
## [1] "real ratings with estimated ratings: CS PLUS (average for all trials)"
## [1] 0.3096034
Here we are seeing if we can recover the same estimates using the simulated ratings. Basically, run Stan using the estimated ratings instead of the real ones, and see if we get the same alpha / beta parameters.
We might decide to use the rescaled estimates here to be more comparable…
RL model adding a beta per stimulus to Alex’s model
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_meansd_RL_4.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(rating_est_plus),ratingsMinus=t(rating_est_minus),cdf_scale=cdf_scale)
flare_fit_rec <- stan(file = stanfile, data = flare_data, iter=chain_iter,warmup = warm_up, chains = chain_n) #add working dir?
## SAMPLING FOR MODEL 'beta_meansd_RL_4' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.016067 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 160.67 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: WARNING: There aren't enough warmup iterations to fit the
## Chain 1: three stages of adaptation as currently configured.
## Chain 1: Reducing each adaptation stage to 15%/75%/10% of
## Chain 1: the given number of warmup iterations:
## Chain 1: init_buffer = 15
## Chain 1: adapt_window = 75
## Chain 1: term_buffer = 10
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 101 / 400 [ 25%] (Sampling)
## Chain 1: Iteration: 140 / 400 [ 35%] (Sampling)
## Chain 1: Iteration: 180 / 400 [ 45%] (Sampling)
## Chain 1: Iteration: 220 / 400 [ 55%] (Sampling)
## Chain 1: Iteration: 260 / 400 [ 65%] (Sampling)
## Chain 1: Iteration: 300 / 400 [ 75%] (Sampling)
## Chain 1: Iteration: 340 / 400 [ 85%] (Sampling)
## Chain 1: Iteration: 380 / 400 [ 95%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 84.007 seconds (Warm-up)
## Chain 1: 288.602 seconds (Sampling)
## Chain 1: 372.609 seconds (Total)
## Chain 1:
Use the summary of the Stan model to extract the different parameters we want to try to use to recreate our data.
Use the simulated ratings per person that we have derived using our parameters and see how well they align with the real ratings…
Only showing the diagonals from the corr.test output here, to get the important t1 x t1 etc. values.
## [1] "original with recovered: ALPHA"
## [1] 0.9877705
## [1] "original with recovered: BETA"
## [1] 0.2878962
Use the summary of the Stan model to extract the different parameters we want to try to use to recreate our data.
params <- summary(flare_fit_m2)
alpha_est <- data.frame(params$summary[1:342,1])
beta_plus <- data.frame(matrix(ncol = 1,nrow=342))
beta_minus <- data.frame(matrix(ncol = 1,nrow=342))
names(beta_plus) <- "beta_plus"
names(beta_minus) <- "beta_minus"
subp = 0
subm = 0
for ( i in 343:1026){
if (i%%2 == 1){
subp= subp+1
beta_plus[subp,1] <- params$summary[i,1]
} else if (i%%2 == 0) {
subm= subm+1
beta_minus[subm,1] <- params$summary[i,1]
}
}
rating_est_plus <- data.frame(matrix(ncol=ntrials,nrow=nsub))
rating_est_minus <- data.frame(matrix(ncol=ntrials,nrow=nsub))
# beta shape parameters
shape1p <- data.frame(matrix(ncol=ntrials,nrow=nsub))
shape1m <- data.frame(matrix(ncol=ntrials,nrow=nsub))
shape2p <- data.frame(matrix(ncol=ntrials,nrow=nsub))
shape2m <- data.frame(matrix(ncol=ntrials,nrow=nsub))
# V parameters (initialised at 0.5)
vp <- data.frame(matrix(ncol=ntrials,nrow=nsub))
vm <- data.frame(matrix(ncol=ntrials,nrow=nsub))
vp[1] <- 0.5
vm[1] <- 0.5
# prediction error
dp <- data.frame(matrix(ncol=(ntrials-1),nrow=nsub))
dm <- data.frame(matrix(ncol=(ntrials-1),nrow=nsub))
Use our extracted parameters in place of estimating them, following the Stan syntax.
Use the alpha parameters we’ve extracted (alpha_est). d == delta (prediction error); v == value (i.e. the value for each stimulus).
for (p in 1:nsub){
for (t in 1:(ntrials-1)){
dp[p,t] <- screamPlus[p,t]-vp[p,t]
dm[p,t] <- screamMinus[p,t]-vm[p,t]
vp[p,t+1] <- vp[p,t]+alpha_est[p,1]*dp[p,t]
vm[p,t+1] <- vm[p,t]+alpha_est[p,1]*dm[p,t]  # note: dm, not dp, for the CS-
}
}
for (t in 1:ntrials){
shape1_Plus[t,p] = VPlus[t,p] * ((VPlus[t,p] * (1-VPlus[t,p])) / beta[p,1]);
shape1_Minus[t,p] = VMinus[t,p] * ((VMinus[t,p] * (1-VMinus[t,p])) / beta[p,2]);
shape2_Plus[t,p] = (1-VPlus[t,p]) * ((VPlus[t,p] * (1-VPlus[t,p])) / beta[p,1]);
shape2_Minus[t,p] = (1-VMinus[t,p]) * ((VMinus[t,p] * (1-VMinus[t,p])) / beta[p,2]);
ratingsPlus[t,p] ~ beta(shape1_Plus[t,p],shape2_Plus[t,p]);
ratingsMinus[t,p] ~ beta(shape1_Minus[t,p],shape2_Minus[t,p]);
}
  }
}
Use the new v frames and beta parameters.
Shape 1 and 2 are sufficient parameters for the beta distribution.
for (p in 1:nsub){
for (t in 1:ntrials){
shape1p[p,t] = vp[p,t] * ((vp[p,t] * (1-vp[p,t])) / beta_plus[p,1])
shape1m[p,t] = vm[p,t] * ((vm[p,t] * (1-vm[p,t])) / beta_minus[p,1])
shape2p[p,t] = (1-vp[p,t]) * ((vp[p,t] * (1-vp[p,t])) / beta_plus[p,1])
shape2m[p,t] = (1-vm[p,t]) * ((vm[p,t] * (1-vm[p,t])) / beta_minus[p,1])
}
}
Trying to use pbeta here (which gives the cumulative distribution function for a set of quantiles).
For now, setting probabilities between 0 and 1 and taking the average…
Use the simulated ratings per person that we have derived using our parameters and see how well they align with the real ratings…
Only showing the diagonals from the corr.test output here, to get the important t1 x t1 etc. values.
## [1] "real ratings with estimated ratings: CS MINUS"
## <0 x 0 matrix>
## [1] "real ratings with estimated ratings: CS MINUS (average for all trials)"
## [1] 0.5856393
## [1] "real ratings with estimated ratings: CS PLUS"
## X1 X2 X3 X4 X5
## 0.006910869 0.108452244 0.078709419 0.069577475 0.070199112
## X6 X7 X8 X9 X10
## 0.144197346 -0.044196976 -0.057392583 -0.046329064 -0.074406587
## X11 X12
## -0.044108626 0.121188162
## [1] "real ratings with estimated ratings: CS PLUS (average for all trials)"
## [1] 0.05712352
Here we are seeing if we can recover the same estimates using the simulated ratings. Basically, run Stan using the estimated ratings instead of the real ones, and see if we get the same alpha / beta parameters.
Rescale the 1-9 expectancy values to be on a 0-1 scale.
Stan cannot deal with the extreme limits of the beta distribution, so make the rescaled limits just above 0 and just below 1.
Note that when a value had to be imputed as it was missing it will not be an integer. Thus the function needs to allow for ranges between values.
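For illustration, a mapping consistent with the nine discrete values seen earlier (1/18, 3/18, …, 17/18) is rating → (2·rating − 1)/18; it keeps values strictly inside (0, 1) and is defined for non-integer imputed ratings too. scale_flare itself is defined elsewhere in the notebook, so `rescale_rating` here is an assumed stand-in:

```r
## map a 1-9 expectancy rating into (0, 1); consistent with the nine
## discrete rescaled values observed above, and defined for non-integer
## (imputed) ratings as well
rescale_rating <- function(r) (2 * r - 1) / 18

rescale_rating(1)    # 1/18  ~ 0.056, just above 0
rescale_rating(9)    # 17/18 ~ 0.944, just below 1
rescale_rating(5.5)  # a non-integer imputed rating also maps inside (0, 1)
```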
#
# minus_scaled_est <- data.frame(matrix(ncol=dim(rating_est_minus)[2],nrow = dim(rating_est_minus)[1]))
#
# ## populate with rescaled values
#
# for (sub in 1:dim(rating_est_minus)[1]){
# for (col in 1:dim(rating_est_minus)[2]){
#
# minus_scaled_est[sub,col] <- scale_flare(rating_est_minus[sub,col])
# }
# }
#
# ## ditto for plus
#
# plus_scaled_est <- data.frame(matrix(ncol=dim(rating_est_plus)[2],nrow = dim(rating_est_plus)[1]))
#
# ## populate with rescaled values
#
# for (sub in 1:dim(rating_est_plus)[1]){
# for (col in 1:dim(rating_est_plus)[2]){
#
# plus_scaled_est[sub,col] <- scale_flare(rating_est_plus[sub,col])
# }
# }
RL model adding a beta per stimulus to Alex’s model
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_meansd_2beta_RL.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(rating_est_plus),ratingsMinus=t(rating_est_minus),cdf_scale=cdf_scale)
flare_fit_rec <- stan(file = stanfile, data = flare_data, iter=chain_iter,warmup = warm_up, chains = chain_n) #add working dir?
## SAMPLING FOR MODEL 'beta_meansd_2beta_RL' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.018063 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 180.63 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: WARNING: There aren't enough warmup iterations to fit the
## Chain 1: three stages of adaptation as currently configured.
## Chain 1: Reducing each adaptation stage to 15%/75%/10% of
## Chain 1: the given number of warmup iterations:
## Chain 1: init_buffer = 15
## Chain 1: adapt_window = 75
## Chain 1: term_buffer = 10
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 101 / 400 [ 25%] (Sampling)
## Chain 1: Iteration: 140 / 400 [ 35%] (Sampling)
## Chain 1: Iteration: 180 / 400 [ 45%] (Sampling)
## Chain 1: Iteration: 220 / 400 [ 55%] (Sampling)
## Chain 1: Iteration: 260 / 400 [ 65%] (Sampling)
## Chain 1: Iteration: 300 / 400 [ 75%] (Sampling)
## Chain 1: Iteration: 340 / 400 [ 85%] (Sampling)
## Chain 1: Iteration: 380 / 400 [ 95%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 162.552 seconds (Warm-up)
## Chain 1: 564.447 seconds (Sampling)
## Chain 1: 726.999 seconds (Total)
## Chain 1:
Use the summary of the Stan model to extract the different parameters we want to try to use to recreate our data.
params_rec <- summary(flare_fit_rec)
alpha_est_rec <- data.frame(params_rec$summary[1:342,1])
beta_plus_rec <- data.frame(matrix(ncol = 1,nrow=342))
beta_minus_rec <- data.frame(matrix(ncol = 1,nrow=342))
names(beta_plus_rec) <- "beta_plus"
names(beta_minus_rec) <- "beta_minus"
subp = 0
subm = 0
for ( i in 343:1026){
if (i%%2 == 1){
subp= subp+1
beta_plus_rec[subp,1] <- params_rec$summary[i,1]
} else if (i%%2 == 0) {
subm= subm+1
beta_minus_rec[subm,1] <- params_rec$summary[i,1]
}
}
Use the simulated ratings per person that we have derived using our parameters and see how well they align with the real ratings…
Only showing the diagonals from the corr.test output here, to get the important t1 x t1 etc. values.
## [1] "original with recovered: ALPHA"
## [1] 0.9841045
## [1] "original with recovered: BETA PLUS"
## beta_plus
## 0.2996459
## [1] "original with recovered: BETA MINUS"
## beta_minus
## 0.4695987
How aversive they find the scream reinforcement. Modelling this on the loss aversion parameter in Charpentier et al. (see the last page before the references).
This will be a single parameter per person, and it represents how much the scream influences their ratings.
Based on the paper, I will try the following to model this in Stan by including it in our value calculations for the CS+ and CS- respectively. We will do this by letting it influence how much the prediction error changes based on whether a scream occurred or not. The prediction error is later used to change the value rating per stimulus:
\[d(stimulus,trial) = scream \cdot \lambda - v(stimulus,trial-1)\]
where \(\lambda\) is the sensitivity to screams.
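A single update step of this scheme can be sketched in R (illustrative numbers; `rl_update` is our own name, with alpha the learning rate and v the current value):

```r
## one Rescorla-Wagner style update with a scream-sensitivity parameter lambda
rl_update <- function(v, scream, alpha, lambda) {
  d <- scream * lambda - v   # prediction error, scaled by sensitivity
  v + alpha * d              # value moves toward lambda (scream) or 0 (no scream)
}

rl_update(v = 0.5, scream = 1, alpha = 0.3, lambda = 0.9)  # -> 0.62
rl_update(v = 0.5, scream = 0, alpha = 0.3, lambda = 0.9)  # -> 0.35
```

Keeping lambda in (0, 1) here keeps the target value inside the beta distribution’s support.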
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_mean1beta_PunSens.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) #add working dir?
## SAMPLING FOR MODEL 'beta_mean1beta_PunSens' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.008756 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 87.56 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 120 / 400 [ 30%] (Warmup)
## Chain 1: Iteration: 160 / 400 [ 40%] (Warmup)
## Chain 1: Iteration: 200 / 400 [ 50%] (Warmup)
## Chain 1: Iteration: 201 / 400 [ 50%] (Sampling)
## Chain 1: Iteration: 240 / 400 [ 60%] (Sampling)
## Chain 1: Iteration: 280 / 400 [ 70%] (Sampling)
## Chain 1: Iteration: 320 / 400 [ 80%] (Sampling)
## Chain 1: Iteration: 360 / 400 [ 90%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 85.7719 seconds (Warm-up)
## Chain 1: 61.1478 seconds (Sampling)
## Chain 1: 146.92 seconds (Total)
## Chain 1:
# extract fit data
summary_flare <- summary(flare_fit)
# extract model summary data
#flare_loglike<- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## [1] "400 iterations on 1 chains. "
## [1] "Estimated 99 Free paramaters per person"
## [1] "This table is very large. Returning only the top 6 entries unless you have set the 3rd function option to 'all'. "
## extract log likelihood
flare_loglike <- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
#calculate BIC
FLARe_bic<-bic(ntrials,-colMeans(flare_loglike),2) #number of parameters in that model e.g. 4)
## mean BIC as model comparisons tool:
print("Mean Bayesian information criterion for model")
## [1] "Mean Bayesian information criterion for model"
## [1] 99.89028
mod_comp <- rbind(na.omit(mod_comp),c("Punishment sensitivity",mean(FLARe_bic)))
mod_comp$BIC <- odp(as.numeric(mod_comp$BIC))
mod_comp <- as.data.frame(na.omit(mod_comp))
## plot function - create plot
plot_models(mod_comp)
Use the summary of the Stan model to extract the different parameters we want to try to use to recreate our data.
rating_est_plus <- data.frame(matrix(ncol=ntrials,nrow=nsub))
rating_est_minus <- data.frame(matrix(ncol=ntrials,nrow=nsub))
# beta shape parameters
shape1p <- data.frame(matrix(ncol=ntrials,nrow=nsub))
shape1m <- data.frame(matrix(ncol=ntrials,nrow=nsub))
shape2p <- data.frame(matrix(ncol=ntrials,nrow=nsub))
shape2m <- data.frame(matrix(ncol=ntrials,nrow=nsub))
# V parameters (initialised at a random value drawn from N(0.5, 0.05))
vp <- data.frame(matrix(ncol=ntrials,nrow=nsub))
vm <- data.frame(matrix(ncol=ntrials,nrow=nsub))
vp[1] <- rnorm(nsub,0.5,0.05)
vm[1] <- rnorm(nsub,0.5,0.05)
# prediction error
dp <- data.frame(matrix(ncol=(ntrials-1),nrow=nsub))
dm <- data.frame(matrix(ncol=(ntrials-1),nrow=nsub))
Use our extracted parameters in place of estimating the same, following the stan syntax.
Use the alpha parameters we’ve extracted (alpha_est); d == delta (prediction error), v == value (i.e. value for each stimulus).
for (p in 1:nsub){
for (t in 1:(ntrials-1)){
dp[p,t] <- screamPlus[p,t]*lambda[p,]-vp[p,t]
dm[p,t] <- screamMinus[p,t]*lambda[p,]-vm[p,t]
vp[p,t+1] <- vp[p,t]+alpha_est[p,1]*dp[p,t]
vm[p,t+1] <- vm[p,t]+alpha_est[p,1]*dm[p,t] # CS- value is updated with its own prediction error (dm), not dp
}
}
for (t in 1:ntrials){
shape1_Plus[t,p] = VPlus[t,p] * ((VPlus[t,p] * (1-VPlus[t,p])) / beta[p,1]);
shape1_Minus[t,p] = VMinus[t,p] * ((VMinus[t,p] * (1-VMinus[t,p])) / beta[p,2]);
shape2_Plus[t,p] = (1-VPlus[t,p]) * ((VPlus[t,p] * (1-VPlus[t,p])) / beta[p,1]);
shape2_Minus[t,p] = (1-VMinus[t,p]) * ((VMinus[t,p] * (1-VMinus[t,p])) / beta[p,2]);
ratingsPlus[t,p] ~ beta(shape1_Plus[t,p],shape2_Plus[t,p]);
ratingsMinus[t,p] ~ beta(shape1_Minus[t,p],shape2_Minus[t,p]);
}
} }
Use the new v frames and beta parameters.
Shape 1 and 2 are sufficient parameters for the beta distribution.
for (p in 1:nsub){
for (t in 1:ntrials){
shape1p[p,t] = vp[p,t] * ((vp[p,t] * (1-vp[p,t])) / beta[p,1])
shape1m[p,t] = vm[p,t] * ((vm[p,t] * (1-vm[p,t])) / beta[p,2]) # CS- uses its own beta, matching the stan code above
shape2p[p,t] = (1-vp[p,t]) * ((vp[p,t] * (1-vp[p,t])) / beta[p,1])
shape2m[p,t] = (1-vm[p,t]) * ((vm[p,t] * (1-vm[p,t])) / beta[p,2])
}
}
Trying to use pbeta here (which derives the distribution function given a set of quantiles).
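As a quick check on this parameterisation: the beta-distribution mean, shape1/(shape1+shape2), recovers the value v exactly, so beta acts as a pure dispersion parameter (example values below are illustrative):

```r
# Under shape1 = v * (v*(1-v)/beta) and shape2 = (1-v) * (v*(1-v)/beta),
# the beta-distribution mean shape1/(shape1+shape2) simplifies to v.
v <- 0.7; b <- 0.1                  # example value and dispersion
s1 <- v * ((v * (1 - v)) / b)
s2 <- (1 - v) * ((v * (1 - v)) / b)
s1 / (s1 + s2)                      # equals v
```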
For now, setting probabilities between 0 and 1 and taking the average…
Use the simulated ratings per person that we have derived using our parameters and see how well they align with the real ratings…
Only showing the diagonals from the corr.test output here to get the important t1 x t1 etc. values.
## [1] "real ratings with estimated ratings: CS MINUS"
## <0 x 0 matrix>
## [1] "real ratings with estimated ratings: CS MINUS (average for all trials)"
## [1] 0.3955608
## [1] "real ratings with estimated ratings: CS PLUS"
## X1 X2 X3 X4 X5 X6
## -0.05332339 0.06200098 0.13674389 0.14012771 0.09619575 0.17409648
## X7 X8 X9 X10 X11 X12
## 0.21752246 0.20111841 0.16258970 0.19958869 0.10088291 0.29599294
## [1] "real ratings with estimated ratings: CS PLUS (average for all trials)"
## [1] 0.2092603
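The diagonal-extraction step described above can be sketched with toy data (the real code uses the per-trial rating frames and corr.test; plain `cor` is used here for a self-contained illustration):

```r
# Correlate real and simulated ratings trial by trial, keeping only the
# diagonal (t1-with-t1, t2-with-t2, ...) of the cross-correlation matrix.
set.seed(1)
real_ratings <- data.frame(X1 = runif(20), X2 = runif(20), X3 = runif(20))
sim_ratings  <- data.frame(X1 = runif(20), X2 = runif(20), X3 = runif(20))
trial_cors <- diag(cor(real_ratings, sim_ratings))
trial_cors        # one correlation per trial
mean(trial_cors)  # average across trials, as reported above
```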
Here we are seeing if we can recover the same estimates using the simulated ratings: basically run stan using the estimated ratings instead of the real ones, and see if we get the same alpha / beta parameters.
RL model adding a beta per stimulus to Alex’s model.
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_mean1beta_PunSens.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(rating_est_plus),ratingsMinus=t(rating_est_minus),cdf_scale=cdf_scale)
flare_fit_rec <- stan(file = stanfile, data = flare_data, iter=chain_iter,warmup = warm_up, chains = chain_n) # add working dir?
##
## SAMPLING FOR MODEL 'beta_mean1beta_PunSens' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.006155 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 61.55 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: WARNING: There aren't enough warmup iterations to fit the
## Chain 1: three stages of adaptation as currently configured.
## Chain 1: Reducing each adaptation stage to 15%/75%/10% of
## Chain 1: the given number of warmup iterations:
## Chain 1: init_buffer = 15
## Chain 1: adapt_window = 75
## Chain 1: term_buffer = 10
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 101 / 400 [ 25%] (Sampling)
## Chain 1: Iteration: 140 / 400 [ 35%] (Sampling)
## Chain 1: Iteration: 180 / 400 [ 45%] (Sampling)
## Chain 1: Iteration: 220 / 400 [ 55%] (Sampling)
## Chain 1: Iteration: 260 / 400 [ 65%] (Sampling)
## Chain 1: Iteration: 300 / 400 [ 75%] (Sampling)
## Chain 1: Iteration: 340 / 400 [ 85%] (Sampling)
## Chain 1: Iteration: 380 / 400 [ 95%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 383.151 seconds (Warm-up)
## Chain 1: 709.201 seconds (Sampling)
## Chain 1: 1092.35 seconds (Total)
## Chain 1:
Use the summary of the stan model to extract the different parameters we want to use to recreate our data.
Use the simulated ratings per person that we have derived using our parameters and see how well they align with the real ratings…
Only showing the diagonals from the corr.test output here to get the important t1 x t1 etc. values.
## [1] "original with recovered: ALPHA"
## [1] 0.9154219
## [1] "original with recovered: BETA"
## [1] 0.08026113
## [1] "original with recovered: LAMBDA"
## [1] 0.8897418
A parameter that represents the rating consistency for multiple repeated / similar trials. It would probably be best to have one each for the CS+ and CS- given these differ in how similar the trials are (the CS- is always unreinforced, for example). That said, consistency may well be a parameter that is constant regardless of reinforcement / stimulus type, especially in later phases, so it is worth testing both models.
A similar parameter is used in the Charpentier et al. paper (see the last page before the references).
We will estimate this parameter as a factor that influences the overall shape of the choice probability distribution (beta distribution). It will do this via the sufficient parameters that are influenced by stimulus value etc. per trial.
note: very unsure about this - need to check it out with Alex
\[shape1(stimulus) = (1 + exp(-\mu \cdot VPlus[t,p] \cdot ((VPlus[t,p] \cdot (1-VPlus[t,p])) / beta[p,1])))^{-1}\] where \(\mu\) is the logit sensitivity.
Here logit sensitivity effectively controls rating consistency; higher values should mean greater consistency.
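A tentative R rendering of that transform (the note above flags it as unchecked, so this is only a sketch; the function and argument names are illustrative): the raw shape term is squashed through a logistic scaled by mu, so larger mu pushes the result towards 1.

```r
# Tentative sketch of the consistency transform: a logistic squash of the
# raw shape1 term, scaled by mu (logit sensitivity). Illustrative only.
consistency_shape <- function(v, beta_disp, mu) {
  raw <- v * ((v * (1 - v)) / beta_disp)  # raw shape1 term
  (1 + exp(-mu * raw))^-1                 # logistic transform, in (0, 1)
}
consistency_shape(0.7, 0.1, mu = 2)
```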
## decide testing rate (min,med,max or off)
testing('min')
## set up run
stanname='beta_mean1beta_Consistency.stan'
stanfile <- file.path(scriptdir, stanname)
flare_data<-list(ntrials=ntrials,nsub=nsub,screamPlus = t(screamPlus), screamMinus= t(screamMinus),ratingsPlus=t(plus_scaled),ratingsMinus=t(minus_scaled))
flare_fit <- stan(file = stanfile, data = flare_data, iter=chain_iter, chains = chain_n) # add working dir?
##
## SAMPLING FOR MODEL 'beta_mean1beta_Consistency' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.010834 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 108.34 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 400 [ 0%] (Warmup)
## Chain 1: Iteration: 40 / 400 [ 10%] (Warmup)
## Chain 1: Iteration: 80 / 400 [ 20%] (Warmup)
## Chain 1: Iteration: 120 / 400 [ 30%] (Warmup)
## Chain 1: Iteration: 160 / 400 [ 40%] (Warmup)
## Chain 1: Iteration: 200 / 400 [ 50%] (Warmup)
## Chain 1: Iteration: 201 / 400 [ 50%] (Sampling)
## Chain 1: Iteration: 240 / 400 [ 60%] (Sampling)
## Chain 1: Iteration: 280 / 400 [ 70%] (Sampling)
## Chain 1: Iteration: 320 / 400 [ 80%] (Sampling)
## Chain 1: Iteration: 360 / 400 [ 90%] (Sampling)
## Chain 1: Iteration: 400 / 400 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 39.8191 seconds (Warm-up)
## Chain 1: 19.324 seconds (Sampling)
## Chain 1: 59.1431 seconds (Total)
## Chain 1:
## Warning: There were 200 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Warning: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
## http://mc-stan.org/misc/warnings.html#bfmi-low
## Warning: Examine the pairs() plot to diagnose sampling problems
# extract fit data
summary_flare_con <- summary(flare_fit)
# extract model summary data
#flare_loglike <- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
## [1] "400 iterations on 1 chains. "
## [1] "Estimated 99 Free paramaters per person"
## [1] "This table is very large. Returning only the top 6 entries unless you have set the 3rd function option to 'all'. "
The alpha parameter is approximately normally distributed (mean 0.4, SD 0.12). Beta is much more bounded now though (combined across both stimuli: mean 0.79, SD 1.6), over 4000 iterations on 4 chains.
## extract log likelihood
flare_loglike_con <- extract_log_lik(flare_fit, parameter_name = "loglik", merge_chains = TRUE)
#calculate BIC
FLARe_bic <- bic(ntrials, -colMeans(flare_loglike_con), 2) # final argument = number of free parameters in the model (e.g. 4)
## mean BIC as model comparisons tool:
print("Mean Bayesian information criterion for model")
## [1] "Mean Bayesian information criterion for model"
## [1] 127.2152
This is similar to Toby’s ‘leaky betas’, I think…
Basically here we want to capture a parameter that estimates how much the learning from the reinforced stimulus influences responses to the ‘safe’ stimulus.
This parameter is how much they retain what they learned over previous trials and use it to inform the current rating.
Investigate change point detection parameters (for when reinforcement changes - i.e. moving from acquisition to extinction). Could do this, or model the phases separately - check which fits best.
Add priors! These are what you expect the group to look like (i.e. alpha is normally distributed around a mean of 0.5 with a variance of 10, or something). LOOK UP R stan choice of priors. * Can have informative or uninformative priors (i.e. agnostic or not).
Avoidance as screams!!!!!
Add parameters
DO NOT FORGET TO MAKE SURE WE HAVE AN ACCURATE SCREAM PATTERN PER PERSON FOR CS+ IN ACQUISITION
Uncomment the series below if you made any changes.